1 INTRODUCTION

At the present time, the most commonly accepted definition of a complex system
is that of a system containing many interdependent constituents which interact
nonlinearly (see footnote 1). Therefore, when we want to model a complex system,
the first issue has to do with the connectivity properties of its network, the
architecture of the wirings between the constituents. In fact, we have recently
learned that the network structure can be as important as the nonlinear
interactions between elements, and that an accurate description of the coupling
architecture and a characterization of the structural properties of the network
can be of fundamental importance for understanding the dynamics of the system.



1 The definition may seem somewhat fuzzy and generic: this is an indication that the notion
of a complex system is still not precisely delineated and differs from author to author. On the
other hand, there is complete agreement that the "ideal" complex systems are the biological
ones, especially those which have to do with people: our bodies, social systems, our cultures.

In the last few years the research on networks has taken different directions,
producing rather unexpected and important results. Researchers have: 1)
proposed various global variables to describe and characterize the properties of
real-world networks; 2) developed different models to simulate the formation and
the growth of networks such as those found in the real world. The results obtained
can be summed up by saying that statistical physics has been able to capture
the structure of many diverse systems within a few common frameworks, though
these common frameworks are very different from the regular array, or the random
connectivity, previously used to model the network of a complex system.

Here we present a list of some of the global quantities introduced to
characterize a network: the characteristic path length L, the clustering coefficient
C, the global efficiency E_glob, the local efficiency E_loc, the cost Cost, and the
degree distribution P(k). We also review two classes of networks proposed:
small-world and scale-free networks. We conclude with a possible application of the
nonextensive thermodynamics formalism to describe scale-free networks.



2 SMALL-WORLD NETWORKS

In ref. [16] Watts and Strogatz have shown that the connection topology of
some biological, social and technological networks is neither completely regular
nor completely random. These networks, which lie somewhere in between regular
and random networks, have been named small worlds in analogy with the
small-world phenomenon empirically observed in social systems more than 30 years
ago [?, 13]. In the mathematical formalism developed by Watts and Strogatz a
generic network is represented as an unweighted graph G with N nodes (vertices)
and K edges (links) between nodes. Such a graph is described by the adjacency
matrix {a_ij}, whose entry a_ij is 1 if there is an edge joining vertex i to
vertex j, and 0 otherwise. The mathematical characterization of the small-world
behavior is based on the evaluation of two quantities, the characteristic path
length L and the clustering coefficient C.

2.1 THE CHARACTERISTIC PATH LENGTH

The characteristic path length L measures the typical separation between two
generic nodes of a graph G. L is defined as:

L(G) = \frac{1}{N(N-1)} \sum_{i \neq j \in G} d_{ij}

where d_ij is the shortest path length between i and j, i.e. the minimum number of
edges traversed to get from a vertex i to another vertex j. By definition d_ij >= 1,
and d_ij = 1 if there exists a direct link between i and j. Notice that if G is
connected, i.e. there exists at least one path connecting any couple of vertices
with a finite number of steps, then d_ij is finite for all i ≠ j and L is also a
finite number. For a non-connected graph, L is an ill-defined quantity, because it
can diverge. This problem is avoided by using E_glob in place of L.
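As an illustration of the definition of L, the following is a minimal sketch that computes the characteristic path length of an unweighted graph by breadth-first search. The input format (a dict mapping each node to the set of its neighbours) and the function name are assumptions made for this example, not part of the original formalism.

```python
from collections import deque

def characteristic_path_length(adj):
    """Average of d_ij over all ordered pairs i != j of an unweighted graph.

    adj: dict mapping each node to the set of its neighbours (assumed format).
    Returns float('inf') when the graph is not connected, since L then diverges.
    """
    n = len(adj)
    total = 0
    for source in adj:
        # Breadth-first search gives the minimum number of edges to each node.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < n:
            return float("inf")  # some node is unreachable from source
        total += sum(dist.values())
    return total / (n * (n - 1))

# Three nodes on a line: 0 - 1 - 2
path = {0: {1}, 1: {0, 2}, 2: {1}}
L_path = characteristic_path_length(path)
```

For the three-node path the distances are 1, 1 and 2, so the average over ordered pairs is 4/3; removing an edge makes the function return infinity, which is the divergence mentioned in the text.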

2.2 THE CLUSTERING COEFFICIENT 

The clustering coefficient C is a local quantity of G measuring the average
cliquishness of a node. For any node i, the subgraph of neighbors of i, G_i, is
considered. If the degree of i, i.e. the number of edges incident with i, is equal
to k_i, then G_i is made of k_i nodes and at most k_i(k_i - 1)/2 edges. C_i is the
fraction of these edges that actually exist, and C is the average value of C_i
over the whole network (by definition 0 <= C <= 1):

C = \frac{1}{N} \sum_{i \in G} C_i, \qquad C_i = \frac{\#\ \text{of edges in } G_i}{k_i (k_i - 1)/2}
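The definition of C can be made concrete with a short sketch. It assumes the same hypothetical input format as a dict of neighbour sets, and it adopts the convention (not stated in the text) that C_i = 0 for nodes with fewer than two neighbours.

```python
def clustering_coefficient(adj):
    """Average over all nodes of C_i = (# edges in G_i) / (k_i (k_i - 1) / 2).

    adj: dict mapping each node to the set of its neighbours (assumed format).
    Nodes with k_i < 2 contribute C_i = 0 (a convention assumed here).
    """
    total = 0.0
    for i, neigh in adj.items():
        k = len(neigh)
        if k < 2:
            continue
        # Each edge inside G_i is seen from both of its ends, hence the / 2.
        links = sum(1 for u in neigh for v in adj[u] if v in neigh) / 2
        total += links / (k * (k - 1) / 2)
    return total / len(adj)

# A triangle 0-1-2 with an extra node 3 attached to node 0.
graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
C_graph = clustering_coefficient(graph)
```

In this example C_0 = 1/3, C_1 = C_2 = 1 and C_3 = 0, so C = 7/12.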

The mathematical characterization of the small-world behavior proposed by
Watts and Strogatz is based on the evaluation of L and C: small-world networks
have high C like regular lattices, and short L like random graphs. The
small-world behavior is ubiquitous in nature and in man-made systems. Neural
networks, social systems [?] such as the collaboration graph of movie actors [16]
or the collaboration network of scientists [?], and technological networks such as
the World Wide Web or the electrical power grid of the Western US, are only a few
such examples. To give an idea of the numbers obtained we consider the simplest
case of the neural networks investigated, that of C. elegans: this network,
represented by a graph with N = 282 nodes (neurons) and K = 1974 edges
(connections between neurons), gives L = 2.65 and C = 0.28 [16]. It is also
important to notice that a network such as the electrical power grid of the
western US can be studied by such a formalism only if considered as an unweighted
graph, i.e. when no importance whatsoever is given to the physical length of the
links.



3 EFFICIENT AND ECONOMIC BEHAVIOR 

A more general formalism, valid both for unweighted and weighted graphs (also
non-connected), extends the application of the small-world analysis to any real
complex network, in particular to those systems where the euclidean distance
between vertices is important (as in the case of the electrical power grid of the
western US), and which are therefore too poorly described by the topology of
connections alone [?]. Such systems are better described by two matrices, the
adjacency matrix {a_ij} defined as before, and a second matrix {ℓ_ij} containing
the weights associated to each link. The latter is named the matrix of physical
distances, because the numbers ℓ_ij can be imagined as the euclidean distances
between i and j. The mathematical characterization of the network is based on the
evaluation of two quantities, the global and the local efficiency (replacing L and
C), and a third one quantifying the cost of the network. Small worlds are networks
that exchange information very efficiently both on a global and on a local
scale [?].

Santa Fe Institute.

February 1, 2008 9:04 p.m.

latora page 4

The Architecture of Complex Systems

3.1 THE GLOBAL EFFICIENCY 

In the case of a weighted network the shortest path length d_ij is defined as the
smallest sum of the physical distances throughout all the possible paths in the
graph from i to j (see footnote 3). The efficiency ε_ij in the communication
between vertex i and j is assumed to be inversely proportional to the shortest
path length: ε_ij = 1/d_ij. When there is no path in the graph between i and j,
d_ij = +∞ and consistently ε_ij = 0. Suppose now that every vertex sends
information along the network, through its edges. The global efficiency of G can
be defined as an average of ε_ij:

E_{glob}(G) = \frac{\sum_{i \neq j \in G} \epsilon_{ij}}{N(N-1)} = \frac{1}{N(N-1)} \sum_{i \neq j \in G} \frac{1}{d_{ij}}

Such a quantity is always a finite number (even when G is unconnected) and
can be normalized to vary in the range [0, 1] if divided by

E_{glob}(G^{ideal}) = \frac{1}{N(N-1)} \sum_{i \neq j \in G} \frac{1}{\ell_{ij}},

the efficiency of the ideal case G^ideal in which the graph has all the
N(N - 1)/2 possible edges. In such a case the information is propagated in the
most efficient way since d_ij = ℓ_ij for all i, j.
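The global efficiency can be sketched in a few lines. The example below assumes a hypothetical input format (a dict of dicts giving the physical length of each existing link, with strictly positive lengths and mutually comparable node labels such as integers) and uses Dijkstra's algorithm for the weighted shortest paths; it returns the unnormalized average of 1/d_ij, which is the quantity to be divided by the efficiency of the ideal graph.

```python
import heapq
import math

def shortest_distances(lengths, source):
    """Dijkstra's algorithm from one source.

    lengths: dict node -> dict neighbour -> positive link length (assumed format).
    Unreachable nodes are simply absent from the returned dict.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in lengths[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def global_efficiency(lengths):
    """Average of 1/d_ij over ordered pairs i != j; missing paths contribute 0."""
    n = len(lengths)
    total = 0.0
    for i in lengths:
        dist = shortest_distances(lengths, i)
        total += sum(1.0 / d for j, d in dist.items() if j != i)
    return total / (n * (n - 1))

# Unit-length path 0 - 1 - 2, and a graph with an isolated node 2.
E_path = global_efficiency({0: {1: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0}})
E_split = global_efficiency({0: {1: 1.0}, 1: {0: 1.0}, 2: {}})
```

For the unit-length path the pair efficiencies are 1, 1 and 1/2, giving 5/6. The disconnected example stays finite (1/3), in contrast with L.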

3.2 THE LOCAL EFFICIENCY 

One of the advantages of the efficiency-based formalism is that a single measure,
the efficiency E (instead of the two different measures L and C), is sufficient to
define the small-world behavior. In fact the efficiency can be evaluated for any
subgraph of G, in particular for G_i, the subgraph of the neighbors of i (made
of k_i nodes and at most k_i(k_i - 1)/2 edges), and therefore it can also be used
to characterize the local properties of the graph. The local efficiency of G is
defined as:

E_{loc}(G) = \frac{1}{N} \sum_{i \in G} E(G_i), \qquad E(G_i) = \frac{1}{k_i(k_i-1)} \sum_{l \neq m \in G_i} \frac{1}{d'_{lm}}

where the quantities {d'_lm} are the shortest distances between nodes l and m
calculated on the graph G_i. Similarly to E_glob, E_loc can also be normalized to
vary in the range [0, 1], and it plays a role similar to that of C. Small worlds
are networks with high E_glob and high E_loc.
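A minimal sketch of the local efficiency, restricted for brevity to the unweighted case (all link lengths equal to 1, so that breadth-first search gives the shortest distances and the ideal-case normalization equals 1). The dict-of-sets input and the function names are assumptions for this example.

```python
from collections import deque

def efficiency(adj):
    """Average of 1/d_lm over ordered pairs of an unweighted graph (0 if < 2 nodes)."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for d in dist.values() if d > 0)
    return total / (n * (n - 1))

def local_efficiency(adj):
    """Average over nodes i of the efficiency of G_i, the subgraph of i's neighbours."""
    total = 0.0
    for i, neigh in adj.items():
        # G_i contains the neighbours of i (not i itself) and the edges among them.
        sub = {u: adj[u] & neigh for u in neigh}
        total += efficiency(sub)
    return total / len(adj)

# A triangle 0-1-2 with an extra node 3 attached to node 0.
graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
E_loc_graph = local_efficiency(graph)
```

Here E(G_0) = 1/3, E(G_1) = E(G_2) = 1 and E(G_3) = 0, so E_loc = 7/12. On this small graph the value coincides with the clustering coefficient, in line with the similar role the two quantities play.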

3 {d_ij} is now calculated by using the information contained both in {a_ij} and in {ℓ_ij}.

3.3 THE COST

An important variable to consider, especially when we deal with weighted
networks and when we want to analyze and compare different real systems, is the
cost of a network [9]. In fact, we expect both E_glob and E_loc to be higher (L
lower and C higher) as the number of edges in the graph increases. On the other
hand, in any real network there is a price to pay for the number and length
(weight) of edges. This can be taken into account by defining the cost of the
graph G as the total length of the network's wirings:

Cost(G) = \frac{\sum_{i \neq j \in G} a_{ij} \ell_{ij}}{\sum_{i \neq j \in G} \ell_{ij}}     (1)

Since the cost of G^ideal is already included in the denominator of the formula
above, Cost varies in [0, 1] and assumes the maximum value 1 when all the edges
are present in the graph. In the case of an unweighted graph, Cost(G) reduces
to the normalized number of edges 2K/N(N - 1).
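The cost can be sketched directly from its definition as the total length of the existing links divided by the total length of all possible links. The N x N list-of-lists inputs and the names below are assumptions for this example.

```python
def cost(adjacency, lengths):
    """Total wiring length of the graph divided by that of the complete graph.

    adjacency[i][j] is 1 if the link i-j exists and 0 otherwise;
    lengths[i][j] is the physical distance between i and j (assumed N x N lists).
    """
    n = len(lengths)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    used = sum(adjacency[i][j] * lengths[i][j] for i, j in pairs)
    ideal = sum(lengths[i][j] for i, j in pairs)
    return used / ideal

# Three points on a line at positions 0, 1 and 3, linked as 0-1 and 1-2.
positions = [0.0, 1.0, 3.0]
ell = [[abs(a - b) for b in positions] for a in positions]
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
cost_weighted = cost(adj, ell)

# Same topology with all lengths set to 1: reduces to 2K / (N (N - 1)).
ones = [[1.0] * 3 for _ in range(3)]
cost_unweighted = cost(adj, ones)
```

The weighted example uses a length of 3 out of a possible 6, so the cost is 0.5; the unweighted one gives 2K/N(N - 1) = 4/6.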

With the three variables E_glob, E_loc and Cost, all defined in [0, 1], it is
possible to study in a unified way unweighted (topological) and weighted networks.
It is also possible to define an economic small world as a network having low
Cost and high E_loc and E_glob (i.e., both economic and small-world). In figure 1
we report a useful illustrative example obtained by means of a simple model to
construct a class of weighted graphs. We start by considering a regular network
of N = 1000 nodes placed on a circle (ℓ_ij is given by the euclidean distance
between i and j) and K = 1500 links. A random rewiring procedure is implemented:
it consists of going through each of the links in turn and, independently with
some probability p, rewiring it. Rewiring means shifting one end of the edge
to a new node chosen randomly with a uniform probability. In this way it is
possible to tune G in a continuous manner from a regular lattice (p = 0) into a
random graph (p = 1), without altering the average number of neighbors, equal
to k = 2K/N. For p ~ 0.02 - 0.04 we observe the small-world behavior: E_glob
has almost reached its maximum value 0.62 while E_loc has not changed much
from the maximum value 0.2 (assumed at p = 0). Moreover, for these values of p
the network is also economic: the Cost stays very close to the minimum
possible value (assumed of course in the regular case p = 0).
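The rewiring procedure can be sketched as follows. The text does not specify how the initial regular network with K = 1500 links is laid out, so this example assumes a ring in which each node is linked to its m nearest neighbours on each side (giving K = N m, which differs from the text's setting); the unit circle radius, the rule forbidding self-loops and duplicate links, and all function names are likewise assumptions.

```python
import math
import random

def ring_lattice(n, m):
    """Edges of a ring where each node links to its m nearest neighbours on each side."""
    return [(i, (i + s) % n) for i in range(n) for s in range(1, m + 1)]

def rewire(edges, n, p, rng):
    """Visit each edge in turn; with probability p move its second end to a
    uniformly chosen node, avoiding self-loops and duplicate links (assumed rule)."""
    present = {frozenset(e) for e in edges}
    result = []
    for i, j in edges:
        if rng.random() < p:
            candidates = [v for v in range(n)
                          if v != i and frozenset((i, v)) not in present]
            if candidates:
                new_j = rng.choice(candidates)
                present.discard(frozenset((i, j)))
                present.add(frozenset((i, new_j)))
                j = new_j
        result.append((i, j))
    return result

def circle_distance(i, j, n):
    """Euclidean (chord) distance between nodes i and j spaced evenly on a unit circle."""
    return 2.0 * math.sin(math.pi * abs(i - j) / n)

rng = random.Random(0)
regular = ring_lattice(20, 2)
unchanged = rewire(regular, 20, 0.0, rng)   # p = 0: regular lattice
randomized = rewire(regular, 20, 1.0, rng)  # p = 1: every edge is rewired
```

Because each rewiring moves one end of an existing edge, the number of links K, and hence the average degree 2K/N, is the same for every value of p. The lengths ℓ_ij returned by `circle_distance` are what would make the rewired graph a weighted one.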
FIGURE 1 The three quantities E_glob, E_loc and Cost are reported as functions of
the rewiring probability p for the model discussed in the text. The economic
small-world behavior shows up for p ~ 0.02 - 0.04.

Some examples of applications to real networks. The neural network of C. elegans
has E_glob = 0.35, E_loc = 0.34, Cost = 0.18: C. elegans is an economic small
world because it achieves high efficiency both at the global and local level
(about 35% of the global and local efficiency of the ideal completely
connected case); all of this at a relatively low cost, with only 18% of the
wirings of the ideal graph. As a second example we consider a technological
network, the MBTA, the Boston underground transportation system. The MBTA is
a weighted network consisting of N = 124 stations and K = 124 tunnels connecting
couples of stations. For such a system we obtain E_glob = 0.63, E_loc = 0.03
and Cost = 0.002. This means that the MBTA achieves 63% of the efficiency of
the ideal subway with a cost of only 0.2%. The price to pay for such low-cost
high global efficiency is the lack of local efficiency. In fact, E_loc = 0.03
indicates that, differently from a neural network (or from a social system), the
MBTA is not fault tolerant, i.e. damage to a station will dramatically affect the
efficiency of the connection between the previous and the next station. The
difference with respect to neural networks comes from different needs and
priorities in the construction and evolution mechanism. When a subway system is
built, priority is given to the achievement of global efficiency at a relatively
low cost, and not to fault tolerance. In fact a temporary problem in a station can
be solved in an economic way by other means: for example, walking, or taking a bus
from the previous to the next station. Applications to other real networks can be
found in ref. [?].



4 SCALE-FREE NETWORKS 
4.1 DEGREE DISTRIBUTION 

Other important information on a network can be extracted from its degree
distribution P(k). The latter is defined as the probability of finding nodes with
k links: P(k) = N(k)/N, where N(k) is the number of nodes with k links. Many
large networks, such as the World Wide Web, the Internet, metabolic and protein
networks, have been named scale-free networks because their degree distribution
follows a power law for large k [9]. Also a social system of interest for the
spreading of sexually transmitted diseases [?], and the connectivity network of
atomic cluster systems [?], show a similar behavior. The most interesting fact
is that neither regular nor random graphs display long tails in P(k), and the
presence of nodes with large k strongly affects the properties of the network,
for instance its response to external factors [?]. In ref. [?] Barabasi and Albert
have proposed a simple model (the BA model) to reproduce the P(k) found in
real networks by modelling the dynamical growth of the network. The model is
based on two simple mechanisms, growth and preferential attachment, that are
also the main ingredients present in the dynamical evolution of real-world
networks. As an example, the World Wide Web grows in time by the addition
of new web pages, and a new web page will more likely include hyperlinks to
popular documents with already high degree. Starting from an initial network with
a few nodes and adding new nodes with new links preferentially connected to
the most important existing nodes, the dynamics of the BA model produces (in
the stationary regime) scale-free networks with a power-law degree distribution

P(k) \sim k^{-\gamma}, \qquad \gamma = 3

The model predicts the emergence of the scale-free behavior observed in real
networks, though the exponents in the power law of real networks can be different
from 3 (usually they are in the range between 2 and 3).
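The two mechanisms of growth and preferential attachment can be sketched in a few lines. The text does not fix the details of the initial network or of how multiple links per new node are drawn, so the fully connected core of m + 1 nodes, the rejection of repeated targets, and the function name are assumptions for this example.

```python
import random
from collections import Counter

def preferential_attachment_graph(n, m, rng):
    """Grow a graph to n nodes; each new node brings m links attached preferentially.

    Starts from a fully connected core of m + 1 nodes (an assumption).
    Returns the list of edges.
    """
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Every node appears in `ends` once per incident edge, so drawing uniformly
    # from `ends` selects a node with probability proportional to its degree.
    ends = [v for e in edges for v in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(ends))
        for t in targets:
            edges.append((new, t))
            ends.extend((new, t))
    return edges

edges = preferential_attachment_graph(200, 2, random.Random(1))
degree = Counter(v for e in edges for v in e)
# N(k): number of nodes with k links, from which P(k) = N(k) / N follows.
nodes_with_degree = Counter(degree.values())
```

With n = 200 and m = 2 the graph has 3 + 197 x 2 = 397 edges. A power-law tail in P(k) only becomes visible for much larger n, so this small run is meant to illustrate the construction rather than to reproduce the exponent.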

4.2 NONEXTENSIVE STATISTICAL MECHANICS 

A more careful analysis of the shape of P(k) of many of the real networks
considered reveals the presence of a plateau for small k. See for example fig. 1a
of Ref. [?] and fig. 2b of Ref. [?]. We have observed that such a plateau for
small k and the different slopes of the power law for large k can be perfectly
reproduced by using the generalized power-law distribution

P(k) \sim [1 + (q-1) \beta k]^{\frac{1}{1-q}}

with two fitting parameters: q, related to the slope of the power law for large
k, and β. The generalized probability distribution above can be obtained as
a stationary solution of a generalized Fokker-Planck equation with a nonlinear
diffusion term. We therefore believe that it is possible to rephrase the
generalized Fokker-Planck equation in terms of a generalized mechanism of network
construction, and to implement a model (more general than the BA model) able
to reproduce the plateau and the different slopes of P(k).
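The two regimes of the generalized power law can be read off from its limits. The short derivation below is a sketch that assumes the distribution has the standard q-exponential form P(k) ~ [1 + (q-1) β k]^{1/(1-q)} with q > 1 and β > 0; the relation between q and the exponent γ follows from that assumed form and is not a fitted result.

```latex
% Small k, i.e. (q-1)\beta k \ll 1: first-order expansion gives a plateau
[1 + (q-1)\beta k]^{\frac{1}{1-q}} \simeq 1 - \beta k

% Large k, i.e. (q-1)\beta k \gg 1: pure power law
[1 + (q-1)\beta k]^{\frac{1}{1-q}} \simeq [(q-1)\beta]^{-\frac{1}{q-1}} \, k^{-\frac{1}{q-1}}
\quad\Longrightarrow\quad \gamma = \frac{1}{q-1}

% Limit q \to 1: ordinary exponential
\lim_{q \to 1} [1 + (q-1)\beta k]^{\frac{1}{1-q}} = e^{-\beta k}
```

Under these assumptions the BA value γ = 3 would correspond to q = 4/3, exponents between 2 and 3 to q between 4/3 and 3/2, and 1/((q-1)β) sets the scale of k at which the plateau crosses over into the power-law tail.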